... percent of t-ratios exceeding ${}_{1-\alpha}t_{N-4}$ in those cases was found when the null hypothesis was true. The results of this empirical study were then compared with the percent of t-ratios exceeding the ${}_{1-\alpha}t_{N-4}$ values for the null hypothesis with homogeneous variances. The trends visible from the results lead to the conclusion that if possible heterogeneity of error variance is suspected, then conservative nominal significance levels should be set if the IMA model is to be used in determining the effect of a treatment. (Author)
Violation of Homogeneity of Variance
Assumption in the Integrated Moving
Averages Time Series Model

The Time Series Quasi-experiment is a method for evaluating the change in level between two successive points in a time series. Observations are taken at equally spaced time intervals and one wishes to make inferences about a possible abrupt shift in level or direction of drift of the time series associated with the introduction of an event at a point in time. Donald T. Campbell and Julian C. Stanley presented this Interrupted Time Series Design in Chapter 5, "Experimental and Quasi-experimental Designs for Research on Teaching," in the Handbook of Research on Teaching (1963). Diagrammatically, the design of the time-series quasi-experiment is as follows:

\[ z_1,\; z_2,\; \ldots,\; z_{n_1},\; T,\; z_{n_1+1},\; \ldots,\; z_{n_1+n_2}, \]

where $z_i$ represents the $i$th observation of a variable and $T$ represents the "treatment."

If the trend of the pre-T observations is altered sharply by the introduction of T, we will attribute the alteration (whether a change in level or in direction of drift) to T. A particularly important problem is to determine whether the activity of the time-series near T indicates a genuine effect of T or merely an orderly continuation of the time-series.

The problem is "particularly important" because the inferential statistical intuitions of social scientists seem seldom to have been developed on non-independent observations (such as those in most time-series). Hence, statistical significance tests are necessary overseers of one's "considered impressions" of the data.
Box and Tiao (1965) developed a method of evaluating the change in level between two successive points of a non-stationary time-series. Observations $z_t$ are taken at equally spaced points in time and inferences are to be made about a possible shift in level of the time-series associated with the occurrence of the event T. This method appears to be the most suitable method now available for analyzing time-series quasi-experiments. It has been used as a method of analysis in several published studies. Two studies of note are: "Analysis of the Connecticut speeding crackdown as a time-series quasi-experiment" by Gene V Glass in the August 1968 Law and Society Review, and "Analysis of data on the 1900 revision of the German divorce laws as a quasi-experiment" by Gene V Glass, George C. Tiao, and Thomas O. Maguire, Law and Society Review (in press).

The model underlying the Box-Tiao analysis of change in level of a time-series is the integrated moving averages (IMA) model. Essentially the model implies that the system is subjected to periodic random shocks (with zero mean). The initial impact of these shocks on the system is denoted $a_t$. Some proportion of each shock remains in the system and has a positive or negative effect on the system over time; this carryover is governed by a parameter $\theta$, with $-1 < \theta < 1$. In terms of these random shocks, the difference between the value of two observations, one at time $t$, the other at time $t-1$, may be written as

\[ z_t - z_{t-1} = a_t - \theta a_{t-1}. \]

This equation may be solved for $z_t$ as a function of the $a$'s alone. In order to facilitate solution for $z_t$, two operators are employed in the following equations; they are the backward shift operator $B$, which is defined as $B z_t = z_{t-1}$, hence $B^m z_t = z_{t-m}$; and the backward difference operator $\nabla$, which can be written in terms of $B$ since

\[ \nabla z_t = z_t - z_{t-1} = (1 - B) z_t . \]
In turn, $\nabla$ has for its inverse the summation operator $S$ given by

\[ \nabla^{-1} z_t = S z_t = \sum_{j=0}^{\infty} z_{t-j} = z_t + z_{t-1} + z_{t-2} + \cdots = (1 + B + B^2 + B^3 + \cdots) z_t = (1 - B)^{-1} z_t . \]

The IMA equation written in operator notation then is:

\[ \nabla z_t = (1 - \theta B) a_t . \]

There is some advantage in writing the right side of the equation in terms of $\nabla$ instead of $B$:

\[ (1 - \theta B) = (1 - \theta)B + (1 - B) = (1 - \theta)B + \nabla = \gamma B + \nabla , \]

where $\gamma = 1 - \theta$ and therefore $0 < \gamma < 2$. Substituting into the equation,

\[ \nabla z_t = (\gamma B + \nabla) a_t , \qquad \nabla z_t = \gamma a_{t-1} + \nabla a_t . \]

From equation (0.1), applying the summation operator $\nabla^{-1}$ to both sides,

\[ z_t = \nabla^{-1}(\gamma a_{t-1} + \nabla a_t) = \gamma \nabla^{-1} a_{t-1} + a_t = \gamma \sum_{j=0}^{\infty} a_{t-1-j} + a_t ; \]

therefore

\[ z_t = \gamma \sum_{j=-\infty}^{t-1} a_j + a_t . \]

If we express the model in terms of $a$'s entering the system after the time origin $k$ we obtain

\[ z_t = L + \gamma \sum_{j=k+1}^{t-1} a_j + a_t , \]

in which the constant, $L$, is the value of the system at the origin time $k$.
By setting $k = 0$, the model's equation can be written using the following notation:

\[ z_t = L + \gamma \sum_{i=1}^{t-1} a_i + a_t . \]

Thus the first observation recorded would be $z_1 = L + a_1$, and for the $n_1$ observations prior to the introduction of a treatment $T$,

\[ z_t = L + \gamma \sum_{i=1}^{t-1} a_i + a_t ; \tag{1} \]

for the $n_2$ observations following $T$,

\[ z_t = L + \delta + \gamma \sum_{i=1}^{t-1} a_i + a_t , \tag{2} \]

where:

$z_t$ is the value of the variable observed at time $t$,

$L$ is a fixed but unknown location parameter,

$\gamma$ is a parameter descriptive of the degree of interdependence of the observations in the time-series and takes values $0 < \gamma < 2$,

$a_t$ is a random normal deviate with mean 0 and variance $\sigma^2$,

$\delta$ is the change in level of the time-series caused by $T$.
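
As an illustration only, the following Python sketch simulates a series from equations (1) and (2). The function name `simulate_ima`, the use of NumPy, and the parameter values are assumptions introduced here; they are not part of the original study.

```python
import numpy as np

def simulate_ima(n1, n2, L, gamma, delta, sigma=1.0, seed=0):
    """Simulate z_t from equations (1) and (2) (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    N = n1 + n2
    a = rng.normal(0.0, sigma, size=N)            # shocks a_1, ..., a_N
    # gamma * sum_{i=1}^{t-1} a_i: cumulative sum of shocks, shifted by one step
    carry = gamma * np.concatenate(([0.0], np.cumsum(a)[:-1]))
    z = L + carry + a                             # equation (1)
    z[n1:] += delta                               # equation (2): level shift after T
    return z, a

# Illustrative values only.
z, a = simulate_ima(n1=25, n2=25, L=0.0, gamma=0.5, delta=3.0)

# First differences should equal a_t - (1 - gamma) a_{t-1},
# plus delta at the single step that crosses T.
diffs = np.diff(z)
expected = a[1:] - (1 - 0.5) * a[:-1]
expected[24] += 3.0
print(np.allclose(diffs, expected))
```

The final check restates the differenced form of the model: away from the intervention, successive differences of $z_t$ depend only on the current and previous shocks.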



Data which conform to the model in (1) and (2) are such that the graph of the time-series follows an erratic, somewhat random path with slight, but no systematic, drifts, trends, or cycles. Data which show a systematic increase or decrease over time (such as population and various growth curves) violate the assumption of zero mean for the random variable $a$. For generality, the random variable portion of the model can be allowed to assume an expected value other than zero; thus "drifting" time-series, those showing a constant rise or fall over time, can be accommodated. The generalization of the model in (1) and (2) is called the "integrated moving average model with deterministic drift" and takes the following form:
\[ z_t = L + \gamma \sum_{i=1}^{t-1} \alpha_i + \alpha_t \tag{3} \]

for the $n_1$ observations prior to the introduction of $T$, and

\[ z_t = L + \delta + \gamma \sum_{i=1}^{t-1} \alpha_i + \alpha_t \tag{4} \]

for the $n_2 = N - n_1$ observations following $T$, where $L$, $\gamma$ and $\delta$ are interpreted as in the model in (1) and (2), but now $\alpha_t$ is a normal variable with variance $\sigma^2$ and mean equal to $\mu$.

The parameter $\mu$ describes the rate of ascent or descent of the time-series.

It is illuminating to express $\alpha_t$ as $\mu + a_t$ and manipulate (3) into a form similar to (1):

\[ z_t = L + \mu\gamma(t-1) + \mu + \gamma \sum_{i=1}^{t-1} a_i + a_t . \tag{5} \]

One sees by inspection of (5) that the time-series in (5) will be expected to have "drifted" approximately $\mu\gamma t$ units at time $t$.

This model can again be modified so that a parameter descriptive of a change in $\mu$, the drift of the series, is incorporated. It is then possible to estimate all of the parameters in the model for a given value of $\gamma$ and to test hypotheses about each.

Let $z_t$ denote the observation of a series at time $t$, prior to the introduction of a treatment $T$:

\[ z_t = L + \gamma\mu(t-1) + \mu + \gamma \sum_{i=1}^{t-1} a_i + a_t , \tag{3} \]

where the interpretation of the elements of the model is identical to the interpretation given earlier in this paper. The following model is descriptive of the behavior of the series for the $n_2$ observations following the introduction of $T$:

\[ z_t = L + \gamma\mu(t-1) + \mu + \gamma\Delta(t - n_1 - 1) + \Delta + \gamma \sum_{i=1}^{t-1} a_i + a_t + \delta , \tag{4} \]

where $\delta$ is the change of level of the series between time $n_1$ and $n_1 + 1$, and $\Delta$ is the change in the drift of the series between these two times. Prior to $T$, the series drifts (on the average) at a rate of $\gamma\mu$ units (up or down depending on the sign of $\mu$) for each unit of time; after $T$, the series drifts $\gamma(\mu + \Delta)$ units on the average for each unit of time.
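
The drift rates stated above can be checked numerically. The sketch below evaluates the expected value of $z_t$ (all shocks set to zero) under the pre- and post-treatment equations and prints the step-to-step increments. The function name `expected_path` and the parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np

def expected_path(n1, n2, L, mu, gamma, delta, Delta):
    """Expected z_t under the drift-change model with all shocks a_t = 0."""
    t = np.arange(1, n1 + n2 + 1)
    m = L + gamma * mu * (t - 1) + mu             # pre-treatment form
    post = t > n1
    # extra post-treatment terms: change in drift and change in level
    m[post] += gamma * Delta * (t[post] - n1 - 1) + Delta + delta
    return m

# Illustrative values only.
m = expected_path(n1=5, n2=5, L=0.0, mu=2.0, gamma=0.5, delta=1.0, Delta=3.0)
steps = np.diff(m)
print(steps[:4])   # increments before T: gamma * mu
print(steps[4])    # increment across T: gamma * mu + Delta + delta
print(steps[5:])   # increments after T: gamma * (mu + Delta)
```

The single increment that spans the intervention combines the ordinary drift with the one-time contributions of $\Delta$ and $\delta$, which is why it differs from both the pre- and post-treatment rates.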

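Interest in this model generally centers on obtaining estimates of the parameters $\delta$ and $\Delta$. In order to do this, a collection of $n_1 + n_2$ observations is made; these values of $z$ are then transformed for a given value of $\gamma$ as follows:

\[ y_t = z_t - \gamma \sum_{i=1}^{t-1} (1 - \gamma)^{i-1} z_{t-i} . \tag{5} \]

By expanding this equation in terms of $L$, $\delta$, $\mu$, and $\Delta$ it can be seen that the structure of a typical $y_t$ is

\[ y_t = \mu + \Delta + (1 - \gamma)^{t-1} L + (1 - \gamma)^{t-n_1-1} \delta + a_t \]

for $t > n_1$; for $t \le n_1$ the terms in $\Delta$ and $\delta$ are absent.

The model, now in the form $y_t$, may be written as $Y = X\beta + a$, where $X$ is defined as an $N \times 4$ matrix of weights, $\beta$ is a $4 \times 1$ vector containing elements $\mu$, $\Delta$, $L$, and $\delta$, and $a$ is an $N \times 1$ vector of random deviates. The equation in vector notation is as follows:

\[
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_{n_1-1} \\ y_{n_1} \\ y_{n_1+1} \\ y_{n_1+2} \\ \vdots \\ y_{n_1+n_2-1} \\ y_{n_1+n_2} \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 1 & 0 \\
1 & 0 & (1-\gamma) & 0 \\
\vdots & \vdots & \vdots & \vdots \\
1 & 0 & (1-\gamma)^{n_1-2} & 0 \\
1 & 0 & (1-\gamma)^{n_1-1} & 0 \\
1 & 1 & (1-\gamma)^{n_1} & 1 \\
1 & 1 & (1-\gamma)^{n_1+1} & (1-\gamma) \\
\vdots & \vdots & \vdots & \vdots \\
1 & 1 & (1-\gamma)^{n_1+n_2-2} & (1-\gamma)^{n_2-2} \\
1 & 1 & (1-\gamma)^{n_1+n_2-1} & (1-\gamma)^{n_2-1}
\end{bmatrix}
\begin{bmatrix} \mu \\ \Delta \\ L \\ \delta \end{bmatrix}
+
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_{n_1-1} \\ a_{n_1} \\ a_{n_1+1} \\ a_{n_1+2} \\ \vdots \\ a_{n_1+n_2-1} \\ a_{n_1+n_2} \end{bmatrix}
\]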

With the model now in this form, when $\gamma$ is known, simple least-squares estimates of $\mu$, $\Delta$, $L$, and $\delta$ can be determined from the familiar solution to the least-squares normal equations:

\[ \hat{\beta} = \begin{bmatrix} \hat{\mu} \\ \hat{\Delta} \\ \hat{L} \\ \hat{\delta} \end{bmatrix} = (X^T X)^{-1} X^T y . \tag{6} \]

The "residual variance" in fitting the model in (3) and (4) to the observations $z$ is given by

\[ s^2 = \frac{(y - X\hat{\beta})^T (y - X\hat{\beta})}{n_1 + n_2 - 4} . \tag{7} \]

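The following distributional statements about the estimates of the parameters follow from the assumption of normality of $a$ and traditional sampling theory:

\[ \frac{\hat{\mu} - \mu}{s\sqrt{c^{11}}} \sim t_{n_1+n_2-4}, \qquad \frac{\hat{\Delta} - \Delta}{s\sqrt{c^{22}}} \sim t_{n_1+n_2-4}, \qquad \frac{\hat{L} - L}{s\sqrt{c^{33}}} \sim t_{n_1+n_2-4}, \qquad \frac{\hat{\delta} - \delta}{s\sqrt{c^{44}}} \sim t_{n_1+n_2-4}, \]

where $c^{jj}$ is the $j$th diagonal element of $(X^T X)^{-1}$.

The above results follow from the linear model $Y = X\beta + a$ in which the errors are assumed to be normal, homoscedastic, and independent.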
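
The estimation steps above (transform the series, build $X$, solve the normal equations, form t-ratios) can be sketched as follows for a known $\gamma$. This is a minimal sketch, not the program used in the study; the function names `transform`, `design_matrix`, `simulate`, and `fit`, and all numeric values, are assumptions made for illustration.

```python
import numpy as np

def transform(z, gamma):
    """y_t = z_t - gamma * sum_{i=1}^{t-1} (1 - gamma)^(i-1) z_{t-i}."""
    z = np.asarray(z, dtype=float)
    y = np.empty_like(z)
    for t in range(len(z)):                  # t is 0-based here
        i = np.arange(1, t + 1)              # lags 1..t (empty for the first point)
        y[t] = z[t] - gamma * np.sum((1 - gamma) ** (i - 1) * z[t - i])
    return y

def design_matrix(n1, n2, gamma):
    """N x 4 matrix X with columns ordered as (mu, Delta, L, delta)."""
    t = np.arange(1, n1 + n2 + 1)
    post = (t > n1).astype(float)
    return np.column_stack([
        np.ones(n1 + n2),
        post,
        (1 - gamma) ** (t - 1),
        post * (1 - gamma) ** np.clip(t - n1 - 1, 0, None),
    ])

def simulate(n1, n2, beta, gamma, a):
    """Generate z_t from the drift-change model for given shocks a."""
    mu, Delta, L, delta = beta
    t = np.arange(1, n1 + n2 + 1)
    carry = gamma * np.concatenate(([0.0], np.cumsum(a)[:-1]))
    z = L + gamma * mu * (t - 1) + mu + carry + a
    post = t > n1
    z[post] += gamma * Delta * (t[post] - n1 - 1) + Delta + delta
    return z

def fit(z, n1, gamma, beta0):
    """Least-squares estimates, residual variance, t-ratios against beta0."""
    n2 = len(z) - n1
    y = transform(z, gamma)
    X = design_matrix(n1, n2, gamma)
    C = np.linalg.inv(X.T @ X)
    beta_hat = C @ X.T @ y
    resid = y - X @ beta_hat
    s2 = resid @ resid / (len(z) - 4)
    t_ratios = (beta_hat - beta0) / np.sqrt(s2 * np.diag(C))
    return beta_hat, s2, t_ratios

# Illustrative run with hypothetical settings.
rng = np.random.default_rng(0)
n1 = n2 = 25
gamma = 0.5
beta_true = np.array([2.0, 0.0, 0.0, 0.0])   # mu, Delta, L, delta
a = rng.normal(size=n1 + n2)
z = simulate(n1, n2, beta_true, gamma, a)
beta_hat, s2, t_ratios = fit(z, n1, gamma, beta0=beta_true)
print(beta_hat, s2, t_ratios)
```

A useful sanity check on such a sketch is that the transformed series equals $X\beta + a$ exactly when the data are generated from the model with the same $\gamma$.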
All of the above operations on the linear model are made for a given value of $\gamma$. When $\gamma$ is unknown (as will generally be true) a Bayesian analysis using sample information about $\gamma$ is used in making inferences about $\delta$ and $\Delta$. The posterior distribution, $h(\gamma \mid y)$, of $\gamma$ given a set of $N$ observations and assuming a uniform prior (in which case the posterior distribution is equivalent to the likelihood distribution of $\gamma$) is given to within a constant of proportionality by the following formula:

[formula not legible in the source]

Illustrations of how the posterior distribution of $\gamma$ in (9) is considered jointly with $\hat{\delta}$ and $\hat{\Delta}$ in making inferences about $\delta$ and $\Delta$ for the simple integrated moving average model with deterministic drift in (4) appear in Box and Tiao (1965) and Maguire and Glass (1967).

Using the model $Y = X\beta + a$, when the value of $\gamma$ is known, least squares estimates of $\mu$, $L$, $\Delta$, and $\delta$ may be determined. These estimates depend on the correctness of the assumptions of normality, homoscedasticity, and independence of the random normal variable $a$. The robustness (ability to stand under violations of these assumptions) of the model has been extensively studied by persons interested in the analysis of variance model. However, the IMA model's use of $\gamma$ and its method of obtaining observations across equally spaced time intervals necessitate study of robustness considerations not touched upon in those studies.



The Problem 
The independence assumption for the Box-Tiao method may be checked through the use of autocorrelations, and in at least some cases steps may be taken to overcome violations of it (Box and Jenkins 1970, pp. 170-177). Study of violations of the normality assumption, while not previously undertaken for this specific method, was passed over in favor of variance violations, which at this time appear to have a greater probability of revealing non-robustness. In the context of the general linear model, the homogeneity of variance assumption has been studied with regard to violations across treatment levels; it has not been studied for violations of homogeneity of variance within each treatment level. Since in time-series quasi-experiments that type of violation can occur and may be a cause for concern, it is being investigated here.
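
As a rough illustration of the autocorrelation check mentioned above, the sketch below computes sample autocorrelations of a residual series. The function name `sample_autocorr` is hypothetical, and the comparison against roughly $\pm 2/\sqrt{N}$ is a common rule of thumb assumed here rather than a procedure taken from this paper.

```python
import numpy as np

def sample_autocorr(x, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of a series x."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = x @ x
    return np.array([(x[k:] @ x[:len(x) - k]) / denom
                     for k in range(1, max_lag + 1)])

# Illustrative use on simulated independent residuals (assumed data).
rng = np.random.default_rng(0)
resid = rng.normal(size=50)
r = sample_autocorr(resid, max_lag=5)
bound = 2 / np.sqrt(len(resid))      # rule-of-thumb band for independent errors
print(r, bound)
```

Autocorrelations that fall well outside the band at low lags would suggest that the independence assumption deserves a closer look.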

Figure 1 is a graph of observations $z_t$ versus time of observation $t$. The population variance of the pretreatment $a_t$ values has been increased in equal increments from $\sigma_0^2$ at $t = 1$ to $10\sigma_0^2$ at $t = 25$. The population variance of all the posttreatment $a_t$ values was held constant at $\sigma_0^2$.

The $\gamma$ value used in obtaining the data for Figure 1 was $\gamma = .5$. Intuitively this means that half of the magnitude of each random shock was stored in the system and affected the magnitude of following observations. In general this means that values coming from a population with a larger variance will have a greater chance of having both a larger initial impact and consequently a larger carryover effect on following observations than would values coming from a population with small variance. Consequently one would expect the slope and level of a series of observations to be more readily affected by random deviations which come from a population having a large variance. This would be particularly true if the number of observations taken was small and the value of $\gamma$ large. In Figure 1 the treatment effects $\delta$ and $\Delta$ were zero, yet to the eye it appears that there may be both a change in level $\delta$ and a change in slope $\Delta$ of the graph. This study investigated the effect of several situations similar to the one noted in Figure 1.
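
A variance pattern of the kind described for Figure 1 can be sketched by scaling unit-variance shocks with the square root of a time-varying multiplier. The function name `variance_multipliers`, the value of `sigma0`, and the choice of post-treatment multiplier are assumptions made for illustration; the multiplier values should be adjusted to whatever pattern is of interest.

```python
import numpy as np

def variance_multipliers(n1, n2, c_pre_end, c_post):
    """Multipliers c_t: equal increments from 1 to c_pre_end before T, constant after."""
    pre = np.linspace(1.0, c_pre_end, n1)
    post = np.full(n2, c_post)
    return np.concatenate([pre, post])

# Assumed settings: 25 + 25 observations, variance rising tenfold before T.
c = variance_multipliers(n1=25, n2=25, c_pre_end=10.0, c_post=1.0)

rng = np.random.default_rng(0)
sigma0 = 1.0                                   # hypothetical base standard deviation
a = np.sqrt(c) * rng.normal(0.0, sigma0, size=50)   # Var(a_t) = c_t * sigma0^2
print(c[:3], c[24], c[25])
```

Multiplying by $\sqrt{c_t}$ rather than $c_t$ is what makes the variance, not the standard deviation, scale by $c_t$.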

Procedure

The four parameters, baseline $L$, the slope $\mu$, and the treatment effects, change in level $\delta$ and change in slope $\Delta$, were set at $L = 0$, $\mu = 2$, $\delta = 0$, and $\Delta = 0$. The $a_t$ values were drawn from a pool of random normal numbers with mean zero and variance $\sigma_0^2$, then multiplied by a value $\sqrt{c_t}$ in order to obtain values with heterogeneous variances.

The method then used was as follows:

1. Given the true null hypotheses $\delta = 0$, $\Delta = 0$, and $\mu = 2$, the $1 - \alpha$ percentile point in the t-distribution with $N - 4$ df was determined from the t-tables.

2. By empirical means the actual percent of t-ratios exceeding ${}_{1-\alpha}t_{N-4}$ was found for each when the null hypotheses were true, the variances heterogeneous, the population normal, and the observations independent.

3. For each null hypothesis the nominal significance level, $\alpha$, and the actual significance level were then compared.
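
The three steps above can be sketched as a small Monte Carlo experiment for the change-in-level parameter $\delta$. This is a simplified sketch under stated assumptions: shocks are drawn directly from a normal generator rather than from a fixed pool, the transformed observations are generated directly as $y = X\beta + a$, the one-sided critical value 1.6787 for 46 df at $\alpha = .05$ is an approximate table value supplied here, and the names `design_matrix` and `rejection_rate` are hypothetical.

```python
import numpy as np

def design_matrix(n1, n2, gamma):
    """N x 4 matrix X with columns ordered as (mu, Delta, L, delta)."""
    t = np.arange(1, n1 + n2 + 1)
    post = (t > n1).astype(float)
    return np.column_stack([
        np.ones(n1 + n2),
        post,
        (1 - gamma) ** (t - 1),
        post * (1 - gamma) ** np.clip(t - n1 - 1, 0, None),
    ])

def rejection_rate(c, gamma, n1=25, n2=25, reps=1000,
                   t_crit=1.6787, sigma0=1.0, seed=0):
    """Share of t-ratios for delta exceeding t_crit when delta = 0 is true."""
    rng = np.random.default_rng(seed)
    N = n1 + n2
    X = design_matrix(n1, n2, gamma)
    C = np.linalg.inv(X.T @ X)
    beta_true = np.array([2.0, 0.0, 0.0, 0.0])   # mu = 2, Delta = 0, L = 0, delta = 0
    exceed = 0
    for _ in range(reps):
        a = np.sqrt(c) * rng.normal(0.0, sigma0, size=N)   # Var(a_t) = c_t * sigma0^2
        y = X @ beta_true + a                    # transformed observations
        beta_hat = C @ X.T @ y
        resid = y - X @ beta_hat
        s2 = resid @ resid / (N - 4)
        t_delta = beta_hat[3] / np.sqrt(s2 * C[3, 3])
        exceed += int(t_delta > t_crit)
    return exceed / reps

# Homogeneous variances (a check run) and one hypothetical heterogeneous pattern.
c_homog = np.ones(50)
c_hetero = np.concatenate([np.linspace(1.0, 10.0, 25), np.full(25, 10.0)])
rate_homog = rejection_rate(c_homog, gamma=0.5)
rate_hetero = rejection_rate(c_hetero, gamma=0.5)
print(rate_homog, rate_hetero)
```

With homogeneous variances the empirical rate should sit near the nominal $\alpha$ up to sampling error; comparing it with the rate under a heterogeneous pattern mirrors step 3.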

A pseudo-random number generator [name and citation illegible in the source] was used to generate a normally distributed population pool of 3000 numbers with mean 0.00000 and variance $\sigma_0^2 = 23.10$[final digit illegible]. The normality assumption of this distribution was tested by the Kolmogorov-Smirnov test and could not be rejected at the .20 level of significance. Since $\delta = 0$ and $\Delta = 0$, the 25 pretreatment observations ($z_1, z_2, \ldots, z_{25}$) and the twenty-five posttreatment observations ($z_{26}, z_{27}, \ldots, z_{50}$) necessary for each t-ratio were determined from the same formula

\[ z_t = L + \gamma\mu(t-1) + \mu + \gamma \sum_{i=1}^{t-1} a_i + a_t , \]

where

$z_t$ = the observed value of the process at time $t$,

$L = 0$,

$\mu = 2$,

$\gamma = .01, .30, 1.0, 1.5$.

To obtain each $a_t$, a number was drawn randomly with replacement from the pool and then multiplied by a value $\sqrt{c_t}$, to obtain an $a_t$ value which was a random normal deviate from a population with mean zero and variance $c_t\sigma_0^2$. t-ratios were determined for $\gamma = .01$, and this process was repeated until 1000 sets of t-ratios were formed for $\gamma = .01$. (In order to compute the t-ratios, parts of the "Computer Program for Analysis of Time Series Experiment with Possible Change in Drift" by G. V Glass and T. O. Maguire were used.) This entire process was repeated for $\gamma$'s of .30, 1.0, and 1.5. As was pointed out earlier in the paper, $\gamma$ is not normally known, but its value is estimated from the posterior probability distribution $h(\gamma \mid y)$, where $0 < \gamma < 2$. The $\gamma$ values .01, .30, 1.0, 1.5 used here are distributed over the range generally covered in practice by this posterior distribution.

As can be seen from Table I, nine types of variance violation and one situation in which the homogeneous variances assumption was not violated were studied. When the variance level changed, it increased or decreased gradually over the length of the pretreatment or posttreatment observations. It is not expected that variance changes would necessarily occur in this smooth manner, but the situation approximates real situations closely enough to make its use feasible in this study.
Results



The results for $\delta$, change in level, are recorded in Table II, and the results for $\Delta$, change in slope, are recorded in Table III. The $\gamma$ level and the nominal significance level, $\alpha$, are at the head of each column and the actual significance levels are recorded within these columns for each of the 10 separate runs. Figure 2 is a set of 3 graphs made from the data in Table II. Each graph is for a set nominal significance level $\alpha$, and shows the actual significance level of each run versus the $\gamma$ values.



Run 0, in which the assumption of homogeneity of variance was met, was done as a check to insure that the computing system was functioning properly. As can be seen from Tables II and III, the nominal levels of significance and the actual levels of significance differ no more than what would be expected for a sample of size 1000.

Change in level $\delta$: To facilitate interpretation of the data recorded in Table II, the data have been graphed in Figure 2, and the data are discussed as three separate groups A, B, and C. Each group has a common variance trend for $a_t$, and the actual significance levels within each group maintain the same general trend. Group A consists of runs numbered 1, 2, and 3; Group B consists of runs numbered 4, 5, and 6; and Group C consists of runs numbered 7, 8, and 9. General trends for each of the groups A, B, and C are included below.

Group A had the variance trend for $a_t$ values [trend not legible in the source].

(a) As $\gamma$ increased, the actual significance level decreased below $\alpha$.

(b) As $c_j$ increased in magnitude, the actual significance level decreased below $\alpha$.

(c) The actual significance level was generally less than the nominal significance level.

Group B had the variance trend for $a_t$ values

\[ \sigma_0^2 \rightarrow c_j\sigma_0^2 ,\; T ,\; c_j\sigma_0^2 \rightarrow c_j\sigma_0^2 \qquad (j = \text{the run number}) \]

where $c_j$ differed for each run $j$.

(a) The actual level of significance was in all cases larger than the nominal significance level (note especially for $\gamma = .3$ and larger).

(b) The peak values of the actual significance level appear to occur for values near $\gamma =$ [value illegible in the source].

(c) The actual level of significance increased as the magnitude of $c_j$ increased.

Group C had the variance trend for $a_t$ values

\[ \sigma_0^2 \rightarrow c_j\sigma_0^2 ,\; T ,\; c_j\sigma_0^2 \rightarrow \sigma_0^2 \qquad (j = \text{the run number}) \]

where $c_j$ differed for each run $j$.

The actual significance levels for group C followed the same general pattern as for group B, and in general were more extreme in their deviations from the nominal $\alpha$.

One aspect common to all runs was the robustness of the model when $\gamma = .01$. That this was to be expected can be seen from the following. When $\gamma = 0$, the IMA equation for the posttreatment observations can be simplified to

\[ z_t = L + \mu + \Delta + \delta + a_t \]

(with the terms in $\Delta$ and $\delta$ absent before the treatment), which is the analysis of variance model. It has previously been shown by empirical means that the analysis of variance model is robust to violations of its homogeneity of variance assumption when the treatment groups are of equal n's. This study approximated the analysis of variance model for the situations in which $\gamma = .01$ ($\gamma = .01$ is a close approximation of $\gamma = 0$). Since in all cases treated here the number of pretreatment observations, $n_1 = 25$, equaled $n_2 = 25$, the number of posttreatment observations, it was expected that in the circumstances where $\gamma = .01$ the nominal and actual significance levels would closely compare.

Change in slope $\Delta$: As can be seen from Table III, the model is remarkably robust for all homogeneity of variance violations studied. None of the actual levels of significance obtained can be termed significantly different from the nominal level of significance.

Conclusions

The trends visible from the results lead to the conclusion that if possible heterogeneity of error variance is suspected, then conservative nominal significance levels should be set if the IMA model is to be used in determining the effect of a treatment. This is increasingly important if the variability of the observations appears to be changing across time.

If interest centers on whether or not a treatment T has had an effect on the slope, nominal significance levels can be chosen without regard to the possible changing variability of the observations across time. As Table III shows, the model is very robust in this respect, at least with regard to all violations tested here.
For a complete development of the IMA model see Box and Jenkins (1970, chapter 4).

The "integrated moving average model with deterministic drift" was presented by G.E.P. Box and G.M. Jenkins on pp. 33-3[?] of Models for Prediction and Control, III. Linear Non-stationary Models, Technical Report No. 79. Madison: Dept. of Statistics, University of Wisconsin, July, 1966. Also see Box and Jenkins (1970).